*Copyright Pierian Data 2017*
*For more information, visit us at www.pieriandata.com*

Rolling and Expanding

A very common process with time series is to create data based off of a rolling mean. Let's show you how to do this easily with pandas!


In [1]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
# Best way to read in data with time series index!
df = pd.read_csv('time_data/walmart_stock.csv',
                 index_col = 'Date',
                 parse_dates = True)

In [3]:
df.head()


Out[3]:
Open High Low Close Volume Adj Close
Date
2012-01-03 59.970001 61.060001 59.869999 60.330002 12668800 52.619235
2012-01-04 60.209999 60.349998 59.470001 59.709999 9593300 52.078475
2012-01-05 59.349998 59.619999 58.369999 59.419998 12768200 51.825539
2012-01-06 59.419998 59.450001 58.869999 59.000000 8069400 51.459220
2012-01-09 59.029999 59.549999 58.919998 59.180000 6679300 51.616215

In [4]:
df['Open'].plot(figsize=(16, 6))


Out[4]:
<matplotlib.axes._subplots.AxesSubplot at 0x2beb3c7bba8>

Now let's add in a rolling mean! This rolling method provides row entries, where every entry is then representative of the window.


In [5]:
# 7 day rolling mean
df.rolling(7).mean().head(20)


Out[5]:
Open High Low Close Volume Adj Close
Date
2012-01-03 NaN NaN NaN NaN NaN NaN
2012-01-04 NaN NaN NaN NaN NaN NaN
2012-01-05 NaN NaN NaN NaN NaN NaN
2012-01-06 NaN NaN NaN NaN NaN NaN
2012-01-09 NaN NaN NaN NaN NaN NaN
2012-01-10 NaN NaN NaN NaN NaN NaN
2012-01-11 59.495714 59.895714 59.074285 59.440000 9.007414e+06 51.842984
2012-01-12 59.469999 59.744285 59.007143 59.321429 8.231357e+06 51.739567
2012-01-13 59.322857 59.638571 58.941428 59.297143 7.965071e+06 51.718386
2012-01-17 59.397143 59.708571 59.105714 59.358572 7.355329e+06 51.771963
2012-01-18 59.450000 59.791428 59.217143 59.502857 7.047043e+06 51.897808
2012-01-19 59.578572 59.960000 59.335715 59.707143 7.412086e+06 52.075984
2012-01-20 59.767143 60.180000 59.577143 59.988571 7.908014e+06 52.321443
2012-01-23 60.017143 60.387143 59.787143 60.204285 8.017800e+06 52.509586
2012-01-24 60.154286 60.672857 59.979999 60.474285 8.035857e+06 52.745077
2012-01-25 60.440000 60.958572 60.270000 60.749999 7.776786e+06 52.985553
2012-01-26 60.715714 61.205714 60.448571 60.910000 7.624814e+06 53.125103
2012-01-27 60.868572 61.361429 60.575714 61.010000 7.678514e+06 53.212323
2012-01-30 60.945715 61.445714 60.661428 61.108571 7.450271e+06 53.298295
2012-01-31 61.057143 61.491429 60.648571 61.158571 7.362086e+06 53.341905

In [6]:
df['Open'].plot()
df.rolling(window = 30).mean()['Close'].plot()


Out[6]:
<matplotlib.axes._subplots.AxesSubplot at 0x2beb3dcbf28>

Easiest way to add a legend is to make this rolling value a new column, then pandas does it automatically!


In [7]:
df['Close: 30 Day Mean'] = df['Close'].rolling(window = 30).mean()
df[['Close','Close: 30 Day Mean']].plot(figsize = (16,6))


Out[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x2beb3e335f8>

expanding

Now what if you want to take into account everything from the start of the time series as a rolling value? For instance, not just take into account a period of 7 days, or monthly rolling average, but instead, take into everything since the beginning of the time series, continuously:


In [8]:
# Optional specify a minimum number of periods
df['Close'].expanding(min_periods=1).mean().plot(figsize = (16, 6))


Out[8]:
<matplotlib.axes._subplots.AxesSubplot at 0x2beb3fbde80>

Bollinger Bands

We will talk a lot more about financial analysis plots and technical indicators, but here is one worth mentioning!

More info : http://www.investopedia.com/terms/b/bollingerbands.asp

Developed by John Bollinger, Bollinger Bands® are volatility bands placed above and below a moving average. Volatility is based on the standard deviation, which changes as volatility increases and decreases. The bands automatically widen when volatility increases and narrow when volatility decreases. This dynamic nature of Bollinger Bands also means they can be used on different securities with the standard settings. For signals, Bollinger Bands can be used to identify Tops and Bottoms or to determine the strength of the trend.

Bollinger Bands reflect direction with the 20-period SMA and volatility with the upper/lower bands. As such, they can be used to determine if prices are relatively high or low. According to Bollinger, the bands should contain 88-89% of price action, which makes a move outside the bands significant. Technically, prices are relatively high when above the upper band and relatively low when below the lower band. However, relatively high should not be regarded as bearish or as a sell signal. Likewise, relatively low should not be considered bullish or as a buy signal. Prices are high or low for a reason. As with other indicators, Bollinger Bands are not meant to be used as a stand alone tool.


In [9]:
df['Close: 30 Day Mean'] = df['Close'].rolling(window=20).mean()
df['Upper'] = df['Close: 30 Day Mean'] + 2 * df['Close'].rolling(window = 20).std()
df['Lower'] = df['Close: 30 Day Mean'] - 2 * df['Close'].rolling(window = 20).std()
df[['Close','Close: 30 Day Mean','Upper','Lower']].plot(figsize = (16, 6))


Out[9]:
<matplotlib.axes._subplots.AxesSubplot at 0x2beb3faba58>

For expanding operations, it doesn't help very much to visualize this against the daily data, but instead its a good way to get an idea of the "stability" of a stock. This idea of stability and volatility is something we are going to be exploring heavily in the next project, so let's jump straight into it!